SQL SERVER – Shrinking NDF and MDF Files – Readers’ Opinion
Posted
by pinaldave
on SQL Authority
See other posts from SQL Authority
or by pinaldave
Published on Tue, 15 Jun 2010 01:30:58 +0000
Indexed on
2010/06/15
3:03 UTC
Read the original article
Hit count: 1536
Readers Contribution
|Readers Question
|sql
|SQL Authority
|SQL Query
|SQL Scripts
|SQL Server
|SQL Tips and Tricks
|T SQL
|Technology
Previously, I had written a blog post about SQL SERVER – Shrinking NDF and MDF Files – A Safe Operation. After that, I have written the following blog post that talks about the advantage and disadvantage of Shrinking and why one should not be Shrinking a file SQL SERVER – SHRINKFILE and TRUNCATE Log File in SQL Server 2008. On this subject, SQL Server Expert Imran Mohammed left an excellent comment. I just feel that his comment is worth a big article itself. For everybody to read his wonderful explanation, I am posting this blog post here. Thanks Imran!
Shrinking Database always creates performance degradation and increases fragmentation in the database. I suggest that you keep that in mind before you start reading the following comment. If you are going to say Shrinking Database is bad and evil, here I am saying it first and loud. Now, the comment of Imran is written while keeping in mind only the process showing how the Shrinking Database Operation works. Imran has already explained his understanding and requests further explanation. I have removed the Best Practices section from Imran’s comments, as there are a few corrections.
Before I explain to you the concept of Shrink Database, let us understand the concept of Database Files.
When we create a new database inside the SQL Server, it is typical that SQl Server creates two physical files in the Operating System: one with .MDF Extension, and another with .LDF Extension.
- .MDF is called as Primary Data File.
- .LDF is called as Transactional Log file.
If you add one or more data files to a database, the physical file that will be created in the Operating System will have an extension of .NDF, which is called as Secondary Data File; whereas, when you add one or more log files to a database, the physical file that will be created in the Operating System will have the same extension as .LDF.
The questions now are, “Why does a new data file have a different extension (.NDF)?”, “Why is it called as a secondary data file?” and, “Why is .MDF file called as a primary data file?”
Answers:
Note: The following explanation is based on my limited knowledge of SQL Server, so experts please do comment.
A data file with a .MDF extension is called a Primary Data File, and the reason behind it is that it contains Database Catalogs. Catalogs mean Meta Data. Meta Data is “Data about Data”. An example for Meta Data includes system objects that store information about other objects, except the data stored by the users.
sysobjects stores information about all objects in that database.
sysindexes stores information about all indexes and rows of every table in that database.
syscolumns stores information about all columns that each table has in that database.
sysusers stores how many users that database has.
Although Meta Data stores information about other objects, it is not the transactional data that a user enters; rather, it’s a system data about the data.
Because Primary Data File (.MDF) contains important information about the database, it is treated as a special file. It is given the name Primary Data file because it contains the Database Catalogs. This file is present in the Primary File Group.
You can always create additional objects (Tables, indexes etc.) in the Primary data file (This file is present in the Primary File group), by mentioning that you want to create this object under the Primary File Group.
Any additional data file that you add to the database will have only transactional data but no Meta Data, so that’s why it is called as the Secondary Data File. It is given the extension name .NDF so that the user can easily identify whether a specific data file is a Primary Data File or a Secondary Data File(s).
There are many advantages of storing data in different files that are under different file groups. You can put your read only in the tables in one file (file group) and read-write tables in another file (file group) and take a backup of only the file group that has read the write data, so that you can avoid taking the backup of a read-only data that cannot be altered. Creating additional files in different physical hard disks also improves I/O performance.
A real-time scenario where we use Files could be this one:
Let’s say you have created a database called MYDB in the D-Drive which has a 50 GB space. You also have 1 Database File (.MDF) and 1 Log File on D-Drive and suppose that all of that 50 GB space has been used up and you do not have any free space left but you still want to add an additional space to the database. One easy option would be to add one more physical hard disk to the server, add new data file to MYDB database and create this new data file in a new hard disk then move some of the objects from one file to another, and put the file group under which you added new file as default File group, so that any new object that is created gets into the new files, unless specified.
Now that we got a basic idea of what data files are, what type of data they store and why they are named the way they are, let’s move on to the next topic, Shrinking.
First of all, I disagree with the Microsoft terminology for naming this feature as “Shrinking”. Shrinking, in regular terms, means to reduce the size of a file by means of compressing it. BUT in SQL Server, Shrinking DOES NOT mean compressing. Shrinking in SQL Server means to remove an empty space from database files and release the empty space either to the Operating System or to SQL Server.
Let’s examine this through an example.
Let’s say you have a database “MYDB” with a size of 50 GB that has a free space of about 20 GB, which means 30GB in the database is filled with data and the 20 GB of space is free in the database because it is not currently utilized by the SQL Server (Database); it is reserved and not yet in use. If you choose to shrink the database and to release an empty space to Operating System, and MIND YOU, you can only shrink the database size to 30 GB (in our example). You cannot shrink the database to a size less than what is filled with data.
So, if you have a database that is full and has no empty space in the data file and log file (you don’t have an extra disk space to set Auto growth option ON), YOU CANNOT issue the SHRINK Database/File command, because of two reasons:
- There is no empty space to be released because the Shrink command does not compress the database; it only removes the empty space from the database files and there is no empty space.
- Remember, the Shrink command is a logged operation. When we perform the Shrink operation, this information is logged in the log file. If there is no empty space in the log file, SQL Server cannot write to the log file and you cannot shrink a database.
Now answering your questions:
(1) Q: What are the USEDPAGES & ESTIMATEDPAGES that appear on the Results Pane after using the DBCC SHRINKDATABASE (NorthWind, 10) ?
A: According to Books Online (For SQL Server 2000):
- UsedPages: the number of 8-KB pages currently used by the file.
- EstimatedPages: the number of 8-KB pages that SQL Server estimates the file could be shrunk down to.
Important Note: Before asking any question, make sure you go through Books Online or search on the Google once. The reasons for doing so have many advantages:
1. If someone else already has had this question before, chances that it is already answered are more than 50 %.
2. This reduces your waiting time for the answer.
(2) Q: What is the difference between Shrinking the Database using DBCC command like the one above & shrinking it from the Enterprise Manager Console by Right-Clicking the database, going to TASKS & then selecting SHRINK Option, on a SQL Server 2000 environment?
A: As far as my knowledge goes, there is no difference, both will work the same way, one advantage of using this command from query analyzer is, your console won’t be freezed. You can do perform your regular activities using Enterprise Manager.
(3) Q: What is this .NDF file that is discussed above? I have never heard of it. What is it used for? Is it used by end-users, DBAs or the SERVER/SYSTEM itself?
A: .NDF File is a secondary data file. You never heard of it because when database is created, SQL Server creates database by default with only 1 data file (.MDF) and 1 log file (.LDF) or however your model database has been setup, because a model database is a template used every time you create a new database using the CREATE DATABASE Command. Unless you have added an extra data file, you will not see it. This file is used by the SQL Server to store data which are saved by the users.
Hope this information helps.
I would like to as the experts to please comment if what I understand is not what the Microsoft guys meant.
Reference: Pinal Dave (http://blog.SQLAuthority.com)
Filed under: Readers Contribution, Readers Question, SQL, SQL Authority, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology
© SQL Authority or respective owner